Overview

Dataset statistics

Number of variables: 22
Number of observations: 16617
Missing cells: 133428
Missing cells (%): 36.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.9 MiB
Average record size in memory: 244.0 B
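
The missing-cell totals can be cross-checked against the per-column missing counts that this report gives for each variable. A minimal sketch in plain Python, using only figures from this report; the name `missing_by_column` is introduced here for illustration:

```python
# Per-column missing counts as reported in the variable sections of this report.
missing_by_column = {
    "index_train": 16617,
    "loaded": 16617,
    "operation_train": 16617,
    "rod_train": 16617,
    "ssp_station_esr": 16617,
    "ssp_station_id": 16617,
    "weight_brutto": 16617,
    "danger": 16595,
    "destination_esr": 130,
    "gruz": 130,
    "receiver": 127,
    "sender": 127,
}

n_rows, n_cols = 16617, 22  # observations and variables from the overview

total_missing = sum(missing_by_column.values())
missing_pct = 100 * total_missing / (n_rows * n_cols)

print(total_missing, f"{missing_pct:.1f}%")
```

Seven columns are entirely empty and `danger` is almost empty, so these eight columns account for nearly all of the missing cells.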

Variable types

Numeric: 11
Unsupported: 7
Categorical: 4

Warnings

danger has constant value "1.0" Constant
operation_car has constant value "28.0" Constant
operation_date has a high cardinality: 2162 distinct values High cardinality
destination_esr is highly correlated with operation_st_esr High correlation
operation_st_esr is highly correlated with destination_esr High correlation
operation_car is highly correlated with danger and 1 other field High correlation
danger is highly correlated with operation_car and 1 other field High correlation
adm is highly correlated with operation_car and 1 other field High correlation
index_train has 16617 (100.0%) missing values Missing
danger has 16595 (99.9%) missing values Missing
loaded has 16617 (100.0%) missing values Missing
operation_train has 16617 (100.0%) missing values Missing
rod_train has 16617 (100.0%) missing values Missing
ssp_station_esr has 16617 (100.0%) missing values Missing
ssp_station_id has 16617 (100.0%) missing values Missing
weight_brutto has 16617 (100.0%) missing values Missing
df_index has unique values Unique
index_train is an unsupported type, check if it needs cleaning or further analysis Unsupported
loaded is an unsupported type, check if it needs cleaning or further analysis Unsupported
operation_train is an unsupported type, check if it needs cleaning or further analysis Unsupported
rod_train is an unsupported type, check if it needs cleaning or further analysis Unsupported
ssp_station_esr is an unsupported type, check if it needs cleaning or further analysis Unsupported
ssp_station_id is an unsupported type, check if it needs cleaning or further analysis Unsupported
weight_brutto is an unsupported type, check if it needs cleaning or further analysis Unsupported
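
The Constant, Missing and Unique warnings above follow from simple per-column counts. The sketch below illustrates that idea in plain Python. It is not pandas-profiling's implementation: `flag_column`, the `missing_threshold` value and the four-row toy columns are all hypothetical.

```python
def flag_column(values, missing_threshold=0.5):
    """Return report-style warning labels for one column.

    `values` is a list in which None marks a missing cell.
    `missing_threshold` is an arbitrary illustrative cut-off, not a
    documented pandas-profiling setting.
    """
    present = [v for v in values if v is not None]
    flags = []
    # Flag columns where the share of missing cells exceeds the cut-off.
    if values and (len(values) - len(present)) / len(values) > missing_threshold:
        flags.append("Missing")
    # A column is constant when its non-missing cells hold a single value.
    if present and len(set(present)) == 1:
        flags.append("Constant")
    # A column is unique when it has no missing cells and no repeats.
    if len(values) > 1 and len(present) == len(values) and len(set(present)) == len(values):
        flags.append("Unique")
    return flags


# Toy columns shaped like the ones flagged in this report.
toy = {
    "danger": ["1.0", None, None, None],
    "operation_car": ["28.0"] * 4,
    "df_index": [11780, 18850, 18854, 18855],
    "loaded": [None] * 4,
}

for name, column in toy.items():
    print(name, flag_column(column))
```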

Reproduction

Analysis started: 2021-04-16 09:04:25.616998
Analysis finished: 2021-04-16 09:04:49.049047
Duration: 23.43 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
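
The reported duration is the difference between the two timestamps. A small check using Python's standard `datetime` module:

```python
from datetime import datetime

# Start and finish timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2021-04-16 09:04:25.616998")
finished = datetime.fromisoformat("2021-04-16 09:04:49.049047")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```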

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 16617
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1868957.486
Minimum: 11780
Maximum: 4175448
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 11780
5-th percentile: 121543.2
Q1: 822630
Median: 1685278
Q3: 2589453
95-th percentile: 4043559.4
Maximum: 4175448
Range: 4163668
Interquartile range (IQR): 1766823

Descriptive statistics

Standard deviation: 1215945.333
Coefficient of variation (CV): 0.6506008524
Kurtosis: -0.858109855
Mean: 1868957.486
Median Absolute Deviation (MAD): 867049
Skewness: 0.4419132175
Sum: 3.105646654 × 10^10
Variance: 1.478523054 × 10^12
Monotonicity: Strictly increasing
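
Several of the df_index figures above are derived from others: the range from the extremes, the IQR from the quartiles, and the CV and variance from the mean and standard deviation. A short sketch that recomputes them from the reported values; small differences are expected because the inputs are rounded:

```python
# Values copied from the df_index statistics in this report.
minimum, maximum = 11780, 4175448
q1, q3 = 822630, 2589453
mean, std = 1868957.486, 1215945.333

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```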
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
3964928 | 1 | < 0.1%
1928503 | 1 | < 0.1%
1201408 | 1 | < 0.1%
3883462 | 1 | < 0.1%
1520076 | 1 | < 0.1%
4060420 | 1 | < 0.1%
4062884 | 1 | < 0.1%
124166 | 1 | < 0.1%
2333961 | 1 | < 0.1%
3897231 | 1 | < 0.1%
Other values (16607) | 16607 | 99.9%

Value | Count | Frequency (%)
11780 | 1 | < 0.1%
18850 | 1 | < 0.1%
18854 | 1 | < 0.1%
18855 | 1 | < 0.1%
18866 | 1 | < 0.1%
18868 | 1 | < 0.1%
18883 | 1 | < 0.1%
18887 | 1 | < 0.1%
18891 | 1 | < 0.1%
18899 | 1 | < 0.1%

Value | Count | Frequency (%)
4175448 | 1 | < 0.1%
4159387 | 1 | < 0.1%
4144452 | 1 | < 0.1%
4071554 | 1 | < 0.1%
4071545 | 1 | < 0.1%
4071479 | 1 | < 0.1%
4071363 | 1 | < 0.1%
4071328 | 1 | < 0.1%
4071308 | 1 | < 0.1%
4071058 | 1 | < 0.1%

index_train
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

length
Real number (ℝ≥0)

Distinct: 42
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9989420473
Minimum: 0.79
Maximum: 1.93
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 0.79
5-th percentile: 0.79
Q1: 0.83
Median: 1.06
Q3: 1.06
95-th percentile: 1.31
Maximum: 1.93
Range: 1.14
Interquartile range (IQR): 0.23

Descriptive statistics

Standard deviation: 0.1915790065
Coefficient of variation (CV): 0.1917819026
Kurtosis: 4.833377137
Mean: 0.9989420473
Median Absolute Deviation (MAD): 0
Skewness: 1.812177886
Sum: 16599.42
Variance: 0.03670251572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)

Value | Count | Frequency (%)
1.06 | 8410 | 50.6%
0.83 | 4944 | 29.8%
0.79 | 1065 | 6.4%
0.85 | 604 | 3.6%
1.77 | 200 | 1.2%
1.31 | 166 | 1.0%
1 | 159 | 1.0%
1.3 | 124 | 0.7%
1.1 | 116 | 0.7%
1.59 | 102 | 0.6%
Other values (32) | 727 | 4.4%

Value | Count | Frequency (%)
0.79 | 1065 | 6.4%
0.82 | 4 | < 0.1%
0.83 | 4944 | 29.8%
0.85 | 604 | 3.6%
0.86 | 2 | < 0.1%
0.87 | 4 | < 0.1%
0.9 | 65 | 0.4%
1 | 159 | 1.0%
1.05 | 3 | < 0.1%
1.06 | 8410 | 50.6%

Value | Count | Frequency (%)
1.93 | 4 | < 0.1%
1.89 | 4 | < 0.1%
1.85 | 3 | < 0.1%
1.77 | 200 | 1.2%
1.73 | 52 | 0.3%
1.72 | 40 | 0.2%
1.7 | 42 | 0.3%
1.69 | 81 | 0.5%
1.62 | 4 | < 0.1%
1.6 | 75 | 0.5%

car_number
Real number (ℝ≥0)

Distinct: 8058
Distinct (%): 48.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37860856.55
Minimum: 22134175
Maximum: 96723176
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 22134175
5-th percentile: 30116156.8
Q1: 30851562
Median: 42008094
Q3: 42203547
95-th percentile: 44335966
Maximum: 96723176
Range: 74589001
Interquartile range (IQR): 11351985

Descriptive statistics

Standard deviation: 7617576.23
Coefficient of variation (CV): 0.2011992576
Kurtosis: 20.13440121
Mean: 37860856.55
Median Absolute Deviation (MAD): 4162618
Skewness: 2.916398387
Sum: 6.291338533 × 10^11
Variance: 5.802746763 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
37847035 | 30 | 0.2%
37846995 | 30 | 0.2%
37688470 | 25 | 0.2%
37814241 | 17 | 0.1%
37843927 | 13 | 0.1%
37680287 | 12 | 0.1%
37677341 | 12 | 0.1%
37672938 | 12 | 0.1%
37839529 | 10 | 0.1%
37816998 | 10 | 0.1%
Other values (8048) | 16446 | 99.0%

Value | Count | Frequency (%)
22134175 | 5 | < 0.1%
23597834 | 1 | < 0.1%
24031486 | 1 | < 0.1%
24062358 | 1 | < 0.1%
24097537 | 1 | < 0.1%
24098626 | 1 | < 0.1%
24213803 | 1 | < 0.1%
24286957 | 1 | < 0.1%
24313629 | 1 | < 0.1%
24482465 | 1 | < 0.1%

Value | Count | Frequency (%)
96723176 | 1 | < 0.1%
96699814 | 1 | < 0.1%
96699780 | 1 | < 0.1%
96699749 | 2 | < 0.1%
96699715 | 1 | < 0.1%
96699707 | 1 | < 0.1%
96699657 | 1 | < 0.1%
96699640 | 1 | < 0.1%
96699418 | 1 | < 0.1%
96699350 | 1 | < 0.1%

destination_esr
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 373
Distinct (%): 2.3%
Missing: 130
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 911481.9782
Minimum: 76404
Maximum: 998100
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 76404
5-th percentile: 841402
Q1: 871200
Median: 915502
Q3: 950807
95-th percentile: 972806
Maximum: 998100
Range: 921696
Interquartile range (IQR): 79607

Descriptive statistics

Standard deviation: 46162.80795
Coefficient of variation (CV): 0.05064588117
Kurtosis: 13.6235286
Mean: 911481.9782
Median Absolute Deviation (MAD): 35799
Skewness: -1.067673366
Sum: 1.502760337 × 10^10
Variance: 2131004838
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
841402 | 712 | 4.3%
863007 | 664 | 4.0%
950807 | 467 | 2.8%
912805 | 374 | 2.3%
963506 | 326 | 2.0%
841604 | 310 | 1.9%
893708 | 300 | 1.8%
889100 | 298 | 1.8%
972806 | 297 | 1.8%
911709 | 277 | 1.7%
Other values (363) | 12462 | 75.0%

Value | Count | Frequency (%)
76404 | 1 | < 0.1%
234901 | 3 | < 0.1%
637301 | 1 | < 0.1%
802107 | 1 | < 0.1%
817600 | 2 | < 0.1%
830003 | 9 | 0.1%
830200 | 154 | 0.9%
830709 | 9 | 0.1%
831203 | 1 | < 0.1%
831504 | 1 | < 0.1%

Value | Count | Frequency (%)
998100 | 7 | < 0.1%
997502 | 14 | 0.1%
996302 | 6 | < 0.1%
995808 | 18 | 0.1%
995507 | 3 | < 0.1%
994701 | 4 | < 0.1%
994400 | 7 | < 0.1%
993304 | 89 | 0.5%
993107 | 59 | 0.4%
992405 | 12 | 0.1%

adm
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 990.0 KiB

Value | Count
20.0 | 16615
99.0 | 2

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 66468
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20.0
2nd row: 20.0
3rd row: 20.0
4th row: 20.0
5th row: 20.0

Value | Count | Frequency (%)
20.0 | 16615 | > 99.9%
99.0 | 2 | < 0.1%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 33232 | 50.0%
. | 16617 | 25.0%
2 | 16615 | 25.0%
9 | 4 | < 0.1%
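
The character counts for adm follow directly from its two string values and their frequencies. A minimal sketch that rebuilds the table with `collections.Counter`; the name `value_counts` is introduced here for illustration:

```python
from collections import Counter

# Value frequencies for adm as reported above.
value_counts = {"20.0": 16615, "99.0": 2}

# Each occurrence of a value contributes all of its characters.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, n in char_counts.most_common():
    print(repr(ch), n, f"{100 * n / total_chars:.1f}%")
```

The total of 66468 characters equals 16617 rows times the fixed length of 4.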

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 49851 | 75.0%
Other Punctuation | 16617 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 33232 | 66.7%
2 | 16615 | 33.3%
9 | 4 | < 0.1%

Other Punctuation
Value | Count | Frequency (%)
. | 16617 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 66468 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 33232 | 50.0%
. | 16617 | 25.0%
2 | 16615 | 25.0%
9 | 4 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66468 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 33232 | 50.0%
. | 16617 | 25.0%
2 | 16615 | 25.0%
9 | 4 | < 0.1%

danger
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): 4.5%
Missing: 16595
Missing (%): 99.9%
Memory size: 649.7 KiB

Value | Count
1.0 | 22

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 66
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Value | Count | Frequency (%)
1.0 | 22 | 0.1%
(Missing) | 16595 | 99.9%

Histogram of lengths of the category

Value | Count | Frequency (%)
1.0 | 22 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 22 | 33.3%
. | 22 | 33.3%
0 | 22 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 44 | 66.7%
Other Punctuation | 22 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 22 | 50.0%
0 | 22 | 50.0%

Other Punctuation
Value | Count | Frequency (%)
. | 22 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 66 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
1 | 22 | 33.3%
. | 22 | 33.3%
0 | 22 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
1 | 22 | 33.3%
. | 22 | 33.3%
0 | 22 | 33.3%

gruz
Real number (ℝ≥0)

Distinct: 89
Distinct (%): 0.5%
Missing: 130
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 305350.7256
Minimum: 11005
Maximum: 999993
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 11005
5-th percentile: 236038
Q1: 236038
Median: 321029
Q3: 321067
95-th percentile: 421161
Maximum: 999993
Range: 988988
Interquartile range (IQR): 85029

Descriptive statistics

Standard deviation: 94031.56141
Coefficient of variation (CV): 0.307946088
Kurtosis: 8.736833778
Mean: 305350.7256
Median Absolute Deviation (MAD): 84991
Skewness: 2.508222769
Sum: 5034317413
Variance: 8841934541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
236038 | 6169 | 37.1%
321067 | 3683 | 22.2%
321029 | 2585 | 15.6%
321052 | 1136 | 6.8%
693087 | 487 | 2.9%
421034 | 417 | 2.5%
321048 | 404 | 2.4%
232153 | 270 | 1.6%
421161 | 209 | 1.3%
421087 | 192 | 1.2%
Other values (79) | 935 | 5.6%
(Missing) | 130 | 0.8%

Value | Count | Frequency (%)
11005 | 8 | < 0.1%
81135 | 1 | < 0.1%
81145 | 1 | < 0.1%
81188 | 2 | < 0.1%
91118 | 7 | < 0.1%
92040 | 2 | < 0.1%
92055 | 1 | < 0.1%
93039 | 34 | 0.2%
93043 | 19 | 0.1%
123041 | 2 | < 0.1%

Value | Count | Frequency (%)
999993 | 5 | < 0.1%
921062 | 1 | < 0.1%
693227 | 87 | 0.5%
693176 | 1 | < 0.1%
693161 | 3 | < 0.1%
693087 | 487 | 2.9%
691005 | 1 | < 0.1%
682237 | 4 | < 0.1%
634089 | 1 | < 0.1%
542239 | 4 | < 0.1%

loaded
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

operation_car
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 990.0 KiB

Value | Count
28.0 | 16617

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 66468
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 28.0
2nd row: 28.0
3rd row: 28.0
4th row: 28.0
5th row: 28.0

Value | Count | Frequency (%)
28.0 | 16617 | 100.0%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
2 | 16617 | 25.0%
8 | 16617 | 25.0%
. | 16617 | 25.0%
0 | 16617 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 49851 | 75.0%
Other Punctuation | 16617 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 16617 | 33.3%
8 | 16617 | 33.3%
0 | 16617 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
. | 16617 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 66468 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
2 | 16617 | 25.0%
8 | 16617 | 25.0%
. | 16617 | 25.0%
0 | 16617 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66468 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
2 | 16617 | 25.0%
8 | 16617 | 25.0%
. | 16617 | 25.0%
0 | 16617 | 25.0%

operation_date
Categorical

HIGH CARDINALITY

Distinct: 2162
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 MiB

Value | Count
2020-07-17 05:50:00 | 123
2020-07-20 07:15:00 | 89
2020-07-20 04:00:00 | 85
2020-07-17 07:00:00 | 78
2020-07-14 10:00:00 | 77
Other values (2157) | 16165

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 315723
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1113
Unique (%): 6.7%

Sample

1st row: 2020-07-16 00:10:00
2nd row: 2020-07-15 20:00:00
3rd row: 2020-07-16 11:00:00
4th row: 2020-07-15 20:00:00
5th row: 2020-07-15 20:00:00
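
operation_date is profiled as Categorical because its values are fixed-width, 19-character timestamp strings. Parsing them into datetime objects makes it possible to aggregate by day or compute intervals. A minimal sketch using the five sample rows above; the format string is inferred from those samples, not from any documented schema:

```python
from collections import Counter
from datetime import datetime

# The five sample rows shown for operation_date.
samples = [
    "2020-07-16 00:10:00",
    "2020-07-15 20:00:00",
    "2020-07-16 11:00:00",
    "2020-07-15 20:00:00",
    "2020-07-15 20:00:00",
]

# Assumed layout: YYYY-MM-DD HH:MM:SS.
parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in samples]

# Count operations per calendar day.
per_day = Counter(ts.date().isoformat() for ts in parsed)
print(per_day)
```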
Value | Count | Frequency (%)
2020-07-17 05:50:00 | 123 | 0.7%
2020-07-20 07:15:00 | 89 | 0.5%
2020-07-20 04:00:00 | 85 | 0.5%
2020-07-17 07:00:00 | 78 | 0.5%
2020-07-14 10:00:00 | 77 | 0.5%
2020-07-22 02:50:00 | 74 | 0.4%
2020-07-29 07:00:00 | 66 | 0.4%
2020-07-15 20:00:00 | 66 | 0.4%
2020-07-19 08:55:00 | 64 | 0.4%
2020-07-26 11:03:00 | 63 | 0.4%
Other values (2152) | 15832 | 95.3%

Histogram of lengths of the category

Value | Count | Frequency (%)
2020-07-20 | 1218 | 3.7%
2020-07-17 | 1160 | 3.5%
2020-07-28 | 1145 | 3.4%
2020-07-23 | 1132 | 3.4%
2020-07-22 | 1127 | 3.4%
2020-07-27 | 1097 | 3.3%
2020-07-29 | 1063 | 3.2%
2020-07-14 | 1054 | 3.2%
2020-07-18 | 984 | 3.0%
2020-07-16 | 983 | 3.0%
Other values (913) | 22271 | 67.0%

Most occurring characters

ValueCountFrequency (%)
0107815
34.1%
249895
15.8%
-33234
 
10.5%
:33234
 
10.5%
721068
 
6.7%
120078
 
6.4%
16617
 
5.3%
58567
 
2.7%
37267
 
2.3%
45483
 
1.7%
Other values (3)12465
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number232638
73.7%
Dash Punctuation33234
 
10.5%
Other Punctuation33234
 
10.5%
Space Separator16617
 
5.3%

Most frequent character per category

ValueCountFrequency (%)
0107815
46.3%
249895
21.4%
721068
 
9.1%
120078
 
8.6%
58567
 
3.7%
37267
 
3.1%
45483
 
2.4%
84828
 
2.1%
93992
 
1.7%
63645
 
1.6%
ValueCountFrequency (%)
-33234
100.0%
ValueCountFrequency (%)
16617
100.0%
ValueCountFrequency (%)
:33234
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common315723
100.0%

Most frequent character per script

ValueCountFrequency (%)
0107815
34.1%
249895
15.8%
-33234
 
10.5%
:33234
 
10.5%
721068
 
6.7%
120078
 
6.4%
16617
 
5.3%
58567
 
2.7%
37267
 
2.3%
45483
 
1.7%
Other values (3)12465
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII315723
100.0%

Most frequent character per block

ValueCountFrequency (%)
0107815
34.1%
249895
15.8%
-33234
 
10.5%
:33234
 
10.5%
721068
 
6.7%
120078
 
6.4%
16617
 
5.3%
58567
 
2.7%
37267
 
2.3%
45483
 
1.7%
Other values (3)12465
 
3.9%

operation_st_esr
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 345
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 911542.8644
Minimum: 830003
Maximum: 998100
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 830003
5-th percentile: 841402
Q1: 871501
Median: 915502
Q3: 950807
95-th percentile: 972806
Maximum: 998100
Range: 168097
Interquartile range (IQR): 79306

Descriptive statistics

Standard deviation: 44676.22042
Coefficient of variation (CV): 0.04901165065
Kurtosis: -1.121618052
Mean: 911542.8644
Median Absolute Deviation (MAD): 35706
Skewness: -0.1458225955
Sum: 1.514710778 × 10^10
Variance: 1995964671
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
841402 | 721 | 4.3%
863007 | 664 | 4.0%
893708 | 497 | 3.0%
912805 | 374 | 2.3%
963506 | 326 | 2.0%
841604 | 310 | 1.9%
953203 | 306 | 1.8%
972806 | 297 | 1.8%
949104 | 295 | 1.8%
911709 | 281 | 1.7%
Other values (335) | 12546 | 75.5%

Value | Count | Frequency (%)
830003 | 9 | 0.1%
830200 | 154 | 0.9%
830709 | 9 | 0.1%
831203 | 1 | < 0.1%
831504 | 1 | < 0.1%
831608 | 2 | < 0.1%
832206 | 13 | 0.1%
832309 | 1 | < 0.1%
832600 | 5 | < 0.1%
832704 | 2 | < 0.1%

Value | Count | Frequency (%)
998100 | 7 | < 0.1%
997502 | 14 | 0.1%
996302 | 6 | < 0.1%
995808 | 18 | 0.1%
995507 | 3 | < 0.1%
994701 | 4 | < 0.1%
994400 | 7 | < 0.1%
993304 | 148 | 0.9%
992405 | 12 | 0.1%
992000 | 16 | 0.1%

operation_st_id
Real number (ℝ≥0)

Distinct: 345
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2000532890
Minimum: 2000035110
Maximum: 2002023867
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 2000035110
5-th percentile: 2000035642
Q1: 2000036422
Median: 2000038088
Q3: 2001930550
95-th percentile: 2001933494
Maximum: 2002023867
Range: 1988757
Interquartile range (IQR): 1894128

Descriptive statistics

Standard deviation: 833013.5554
Coefficient of variation (CV): 0.0004163958312
Kurtosis: -0.8208829307
Mean: 2000532890
Median Absolute Deviation (MAD): 1716
Skewness: 1.085805461
Sum: 3.324285503 × 10^13
Variance: 6.939115835 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2001930642 | 721 | 4.3%
2001933494 | 664 | 4.0%
2000036018 | 497 | 3.0%
2000036350 | 374 | 2.3%
2000038468 | 326 | 2.0%
2001930646 | 310 | 1.9%
2000038222 | 306 | 1.8%
2000038722 | 297 | 1.8%
2000037980 | 295 | 1.8%
2000036344 | 281 | 1.7%
Other values (335) | 12546 | 75.5%

Value | Count | Frequency (%)
2000035110 | 4 | < 0.1%
2000035130 | 4 | < 0.1%
2000035162 | 222 | 1.3%
2000035182 | 67 | 0.4%
2000035212 | 6 | < 0.1%
2000035224 | 26 | 0.2%
2000035232 | 2 | < 0.1%
2000035252 | 1 | < 0.1%
2000035344 | 2 | < 0.1%
2000035438 | 10 | 0.1%

Value | Count | Frequency (%)
2002023867 | 56 | 0.3%
2001933536 | 60 | 0.4%
2001933522 | 56 | 0.3%
2001933518 | 70 | 0.4%
2001933514 | 6 | < 0.1%
2001933508 | 2 | < 0.1%
2001933504 | 138 | 0.8%
2001933500 | 74 | 0.4%
2001933498 | 3 | < 0.1%
2001933494 | 664 | 4.0%

operation_train
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

receiver
Real number (ℝ≥0)

Distinct: 169
Distinct (%): 1.0%
Missing: 127
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 17705648.07
Minimum: 0
Maximum: 99849255
Zeros: 63
Zeros (%): 0.4%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 83262
Q1: 1099058
Median: 4733952
Q3: 14999355
95-th percentile: 68594560
Maximum: 99849255
Range: 99849255
Interquartile range (IQR): 13900297

Descriptive statistics

Standard deviation: 24110999.01
Coefficient of variation (CV): 1.361768793
Kurtosis: 1.238463128
Mean: 17705648.07
Median Absolute Deviation (MAD): 4650690
Skewness: 1.579045294
Sum: 2.919661367 × 10^11
Variance: 5.813402731 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
14999355 | 3818 | 23.0%
83262 | 1184 | 7.1%
1098923 | 1101 | 6.6%
4733952 | 801 | 4.8%
1099182 | 752 | 4.5%
67918167 | 650 | 3.9%
68594560 | 615 | 3.7%
1099058 | 538 | 3.2%
1099064 | 513 | 3.1%
1098969 | 440 | 2.6%
Other values (159) | 6078 | 36.6%

Value | Count | Frequency (%)
0 | 63 | 0.4%
83262 | 1184 | 7.1%
186424 | 2 | < 0.1%
186476 | 1 | < 0.1%
282398 | 2 | < 0.1%
1058964 | 33 | 0.2%
1059082 | 3 | < 0.1%
1089278 | 13 | 0.1%
1089290 | 1 | < 0.1%
1089396 | 20 | 0.1%

Value | Count | Frequency (%)
99849255 | 4 | < 0.1%
98098048 | 1 | < 0.1%
97773571 | 1 | < 0.1%
96239734 | 1 | < 0.1%
93775225 | 1 | < 0.1%
93773539 | 1 | < 0.1%
92191551 | 2 | < 0.1%
90378185 | 281 | 1.7%
88494276 | 1 | < 0.1%
87045988 | 107 | 0.6%

rodvag
Real number (ℝ≥0)

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.44803514
Minimum: 20
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 20
5-th percentile: 40
Q1: 40
Median: 40
Q3: 90
95-th percentile: 90
Maximum: 99
Range: 79
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 25.19690318
Coefficient of variation (CV): 0.390964645
Kurtosis: -1.940986595
Mean: 64.44803514
Median Absolute Deviation (MAD): 20
Skewness: 0.003313810892
Sum: 1070933
Variance: 634.8839298
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
40 | 8152 | 49.1%
90 | 8120 | 48.9%
20 | 181 | 1.1%
60 | 144 | 0.9%
95 | 11 | 0.1%
70 | 4 | < 0.1%
93 | 2 | < 0.1%
87 | 1 | < 0.1%
96 | 1 | < 0.1%
99 | 1 | < 0.1%

Value | Count | Frequency (%)
20 | 181 | 1.1%
40 | 8152 | 49.1%
60 | 144 | 0.9%
70 | 4 | < 0.1%
87 | 1 | < 0.1%
90 | 8120 | 48.9%
93 | 2 | < 0.1%
95 | 11 | 0.1%
96 | 1 | < 0.1%
99 | 1 | < 0.1%

Value | Count | Frequency (%)
99 | 1 | < 0.1%
96 | 1 | < 0.1%
95 | 11 | 0.1%
93 | 2 | < 0.1%
90 | 8120 | 48.9%
87 | 1 | < 0.1%
70 | 4 | < 0.1%
60 | 144 | 0.9%
40 | 8152 | 49.1%
20 | 181 | 1.1%
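
Because rodvag has only 10 distinct values, its frequency table is complete, and the reported Sum and Mean can be recomputed from it. A short sketch; the name `rodvag_counts` is introduced here for illustration:

```python
# Complete value -> count table for rodvag, copied from this report.
rodvag_counts = {
    40: 8152,
    90: 8120,
    20: 181,
    60: 144,
    95: 11,
    70: 4,
    93: 2,
    87: 1,
    96: 1,
    99: 1,
}

n = sum(rodvag_counts.values())                                    # observations
total = sum(value * count for value, count in rodvag_counts.items())  # Sum
mean = total / n                                                   # Mean

print(n, total, round(mean, 6))
```

Two values, 40 and 90, cover about 98% of the rows, which is consistent with the strongly negative kurtosis reported for this variable.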

rod_train
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

sender
Real number (ℝ≥0)

Distinct: 168
Distinct (%): 1.0%
Missing: 127
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 18281683.68
Minimum: 0
Maximum: 99849255
Zeros: 75
Zeros (%): 0.5%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 83262
Q1: 1098969
Median: 4733952
Q3: 21871408
95-th percentile: 68594560
Maximum: 99849255
Range: 99849255
Interquartile range (IQR): 20772439

Descriptive statistics

Standard deviation: 23990613.33
Coefficient of variation (CV): 1.312275923
Kurtosis: 0.7616405423
Mean: 18281683.68
Median Absolute Deviation (MAD): 4650690
Skewness: 1.4417245
Sum: 3.014649639 × 10^11
Variance: 5.755495277 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
14999355 | 3523 | 21.2%
83262 | 1107 | 6.7%
1098923 | 823 | 5.0%
1098969 | 716 | 4.3%
1099182 | 682 | 4.1%
67918167 | 663 | 4.0%
68594560 | 632 | 3.8%
21871408 | 551 | 3.3%
4733952 | 522 | 3.1%
49721291 | 521 | 3.1%
Other values (158) | 6750 | 40.6%

Value | Count | Frequency (%)
0 | 75 | 0.5%
83262 | 1107 | 6.7%
186424 | 7 | < 0.1%
1055693 | 10 | 0.1%
1058964 | 10 | 0.1%
1059018 | 5 | < 0.1%
1059082 | 170 | 1.0%
1068610 | 2 | < 0.1%
1080509 | 1 | < 0.1%
1086742 | 1 | < 0.1%

Value | Count | Frequency (%)
99849255 | 1 | < 0.1%
99339028 | 1 | < 0.1%
96239734 | 6 | < 0.1%
93700380 | 1 | < 0.1%
90378185 | 194 | 1.2%
90350471 | 1 | < 0.1%
87326173 | 2 | < 0.1%
87045988 | 86 | 0.5%
84939521 | 11 | 0.1%
84480471 | 3 | < 0.1%

ssp_station_esr
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

ssp_station_id
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

tare_weight
Real number (ℝ≥0)

Distinct: 133
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 239.3455497
Minimum: 184
Maximum: 890
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.9 KiB

Quantile statistics

Minimum: 184
5-th percentile: 184
Q1: 205
Median: 223
Q3: 237
95-th percentile: 510
Maximum: 890
Range: 706
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 83.95213728
Coefficient of variation (CV): 0.350757043
Kurtosis: 8.520366671
Mean: 239.3455497
Median Absolute Deviation (MAD): 14
Skewness: 2.98427585
Sum: 3977205
Variance: 7047.961354
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
184 | 3983 | 24.0%
210 | 1810 | 10.9%
232 | 1387 | 8.3%
237 | 1325 | 8.0%
230 | 1131 | 6.8%
240 | 842 | 5.1%
209 | 739 | 4.4%
225 | 558 | 3.4%
222 | 393 | 2.4%
212 | 358 | 2.2%
Other values (123) | 4091 | 24.6%

Value | Count | Frequency (%)
184 | 3983 | 24.0%
192 | 1 | < 0.1%
195 | 1 | < 0.1%
198 | 47 | 0.3%
203 | 47 | 0.3%
204 | 26 | 0.2%
205 | 150 | 0.9%
207 | 27 | 0.2%
208 | 4 | < 0.1%
209 | 739 | 4.4%

Value | Count | Frequency (%)
890 | 1 | < 0.1%
690 | 44 | 0.3%
663 | 4 | < 0.1%
639 | 2 | < 0.1%
625 | 28 | 0.2%
620 | 12 | 0.1%
610 | 1 | < 0.1%
580 | 1 | < 0.1%
570 | 20 | 0.1%
550 | 59 | 0.4%

weight_brutto
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 16617
Missing (%): 100.0%
Memory size: 129.9 KiB

Interactions

[Figures: pairwise interaction plots, rendered with Matplotlib v3.4.1; images not preserved]
2021-04-16T15:04:45.446982image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:45.592875image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:45.738831image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:45.898831image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:46.053762image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:46.221564image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:46.395573image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:46.629567image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:46.873575image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:47.124564image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:47.291565image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-04-16T15:04:47.459566image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
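
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the function name `pearson_r` and the sample values are assumptions made for the example. The common 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (co)deviations; the 1/n factors cancel in the final ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has zero standard deviation, so r is undefined.
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values: y is an exact increasing linear function of x.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
r_linear = pearson_r(xs, ys)
# Shifting and positively rescaling y leaves r unchanged.
r_rescaled = pearson_r(xs, [3 * y + 7 for y in ys])
```

The explicit error for a constant input reflects that r has no defined value when one standard deviation is zero, which is the situation for any column the report flags as constant.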

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
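
As a rough sketch of this definition, the following plain-Python example ranks each variable and then applies the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho`, the choice to give tied values their average rank, and the sample values are assumptions made for the illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up values: y = x**3 is monotonic in x but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
```

For this cubic relationship the ranks of the two variables coincide, so ρ reaches its maximum while Pearson's r on the raw values stays below it.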

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
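
The pair-counting rule above can be sketched directly in plain Python. This is an illustration of the formula exactly as stated (often called tau-a), with the function name `kendall_tau` and the sample values chosen for the example. Statistical libraries commonly apply a tie-corrected variant instead, so their results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, as described above."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x2 - x1) * (y2 - y1)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie; it counts toward neither group
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up values: of the three pairs, two are concordant and one is discordant.
tau = kendall_tau([1, 2, 3], [1, 3, 2])
```

With three observations there are three pairs, so the example evaluates (2 - 1) / 3.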

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
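
A minimal sketch of a bias-corrected Cramér's V, following the commonly cited form of Bergsma's correction, is shown below. It is not taken from the report's own implementation: the function name `cramers_v_corrected`, the contingency-table input format, and the choice to return 0.0 when the corrected denominator is not positive are all assumptions made for the illustration.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n_rows = len(table)
    n_cols = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n_rows < 2 or n_cols < 2 or n < 2 or 0 in row_totals or 0 in col_totals:
        raise ValueError("need at least a 2x2 table with no empty row or column")

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (n_cols - 1) * (n_rows - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)
    denominator = min(rows_corr - 1, cols_corr - 1)
    if denominator <= 0:
        return 0.0  # assumption: treat a degenerate tiny sample as no association
    return math.sqrt(phi2_corr / denominator)


# Made-up counts: one table with no association, one with perfect association.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The two example tables sit at the ends of the range described above: identical proportions in every row give independence, and a diagonal table gives perfect association.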

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
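
The quantities behind the nullity-by-column and nullity-correlation views can be sketched in plain Python. This is an illustration under stated assumptions, not the plotting code used for the report: rows are represented as dictionaries with `None` for a missing cell, the helper names `nullity_counts` and `nullity_correlation` are hypothetical, and the cell values are made up (only the column names come from the sample below).

```python
import math


def nullity_counts(rows, columns):
    """Number of missing (None) cells in each column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the present/missing indicators of two columns.

    Returns None when either column is always present or always missing,
    because its indicator is constant and the correlation is undefined.
    """
    a = [0 if row.get(col_a) is None else 1 for row in rows]
    b = [0 if row.get(col_b) is None else 1 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cov / math.sqrt(var_a * var_b)


# Made-up rows; weight_brutto is missing everywhere, as in the report's warnings.
rows = [
    {"receiver": 1098923.0, "sender": 1098923.0, "weight_brutto": None},
    {"receiver": None, "sender": None, "weight_brutto": None},
    {"receiver": 83262.0, "sender": 83262.0, "weight_brutto": None},
    {"receiver": None, "sender": 0.0, "weight_brutto": None},
]
counts = nullity_counts(rows, ["receiver", "sender", "weight_brutto"])
corr = nullity_correlation(rows, "receiver", "sender")
undefined = nullity_correlation(rows, "receiver", "weight_brutto")
```

A column that is entirely missing, such as `weight_brutto` in this dataset, carries no information about co-occurring gaps, which is why the sketch returns `None` for it rather than a number.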

Sample

First rows

(index) | df_index | index_train | length | car_number | destination_esr | adm | danger | gruz | loaded | operation_car | operation_date | operation_st_esr | operation_st_id | operation_train | receiver | rodvag | rod_train | sender | ssp_station_esr | ssp_station_id | tare_weight | weight_brutto
0 | 11780 | NaN | 1.41 | 94541943 | 940105.0 | 20.0 | NaN | 693087.0 | NaN | 28.0 | 2020-07-16 00:10:00 | 940105.0 | 2.000038e+09 | NaN | 20407488.0 | 96.0 | NaN | 20407488.0 | NaN | NaN | 240.0 | NaN
1 | 18850 | NaN | 1.06 | 94976461 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 198.0 | NaN
2 | 18854 | NaN | 1.06 | 94976131 | 853503.0 | 20.0 | NaN | 321067.0 | NaN | 28.0 | 2020-07-16 11:00:00 | 853503.0 | 2.001931e+09 | NaN | 49721291.0 | 90.0 | NaN | 83262.0 | NaN | NaN | 198.0 | NaN
3 | 18855 | NaN | 1.06 | 94975745 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 204.0 | NaN
4 | 18866 | NaN | 1.06 | 94978376 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 198.0 | NaN
5 | 18868 | NaN | 1.06 | 94977535 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 204.0 | NaN
6 | 18883 | NaN | 1.06 | 94973989 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 198.0 | NaN
7 | 18887 | NaN | 1.06 | 94973625 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 204.0 | NaN
8 | 18891 | NaN | 1.06 | 94972650 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 90.0 | NaN | 1098923.0 | NaN | NaN | 198.0 | NaN
9 | 18899 | NaN | 1.06 | 94975505 | 861209.0 | 20.0 | NaN | 321067.0 | NaN | 28.0 | 2020-07-15 19:50:00 | 861209.0 | 2.001931e+09 | NaN | 49721291.0 | 90.0 | NaN | 49721291.0 | NaN | NaN | 198.0 | NaN

Last rows

(index) | df_index | index_train | length | car_number | destination_esr | adm | danger | gruz | loaded | operation_car | operation_date | operation_st_esr | operation_st_id | operation_train | receiver | rodvag | rod_train | sender | ssp_station_esr | ssp_station_id | tare_weight | weight_brutto
16607 | 4071058 | NaN | 1.06 | 42482679 | 918407.0 | 20.0 | NaN | 321067.0 | NaN | 28.0 | 2020-07-16 00:00:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 40.0 | NaN | 14999355.0 | NaN | NaN | 210.0 | NaN
16608 | 4071308 | NaN | 1.06 | 43137660 | 930305.0 | 20.0 | NaN | 254159.0 | NaN | 28.0 | 2020-07-16 10:53:00 | 930305.0 | 2.000037e+09 | NaN | 87045988.0 | 40.0 | NaN | 87045988.0 | NaN | NaN | 220.0 | NaN
16609 | 4071328 | NaN | 1.06 | 43041771 | 871200.0 | 20.0 | NaN | 321067.0 | NaN | 28.0 | 2020-07-15 20:18:00 | 871200.0 | 2.001934e+09 | NaN | 4733892.0 | 40.0 | NaN | 1098923.0 | NaN | NaN | 210.0 | NaN
16610 | 4071363 | NaN | 1.06 | 43025733 | 862803.0 | 20.0 | NaN | 321029.0 | NaN | 28.0 | 2020-07-15 20:00:00 | 862803.0 | 2.001931e+09 | NaN | 1098923.0 | 40.0 | NaN | 1098923.0 | NaN | NaN | 210.0 | NaN
16611 | 4071479 | NaN | 1.06 | 42716340 | 971201.0 | 20.0 | NaN | 351043.0 | NaN | 28.0 | 2020-07-16 06:21:00 | 971201.0 | 2.000039e+09 | NaN | 14999355.0 | 40.0 | NaN | 14999355.0 | NaN | NaN | 209.0 | NaN
16612 | 4071545 | NaN | 1.06 | 43568575 | 841402.0 | 20.0 | NaN | 321067.0 | NaN | 28.0 | 2020-07-16 05:34:00 | 841402.0 | 2.001931e+09 | NaN | 83262.0 | 40.0 | NaN | 83262.0 | NaN | NaN | 209.0 | NaN
16613 | 4071554 | NaN | 1.06 | 43578178 | 930305.0 | 20.0 | NaN | 414168.0 | NaN | 28.0 | 2020-07-16 11:21:00 | 930305.0 | 2.000037e+09 | NaN | 87045988.0 | 40.0 | NaN | 87045988.0 | NaN | NaN | 221.0 | NaN
16614 | 4144452 | NaN | 1.00 | 61716221 | 851809.0 | 20.0 | NaN | 232431.0 | NaN | 28.0 | 2020-07-16 09:20:00 | 851809.0 | 2.001931e+09 | NaN | 46620009.0 | 60.0 | NaN | 57897062.0 | NaN | NaN | 240.0 | NaN
16615 | 4159387 | NaN | 1.00 | 60138401 | 840005.0 | 20.0 | NaN | 232431.0 | NaN | 28.0 | 2020-07-16 14:44:00 | 840005.0 | 2.001931e+09 | NaN | 0.0 | 60.0 | NaN | 0.0 | NaN | NaN | 240.0 | NaN
16616 | 4175448 | NaN | 1.00 | 63856223 | 840005.0 | 20.0 | NaN | 91118.0 | NaN | 28.0 | 2020-07-16 13:25:00 | 840005.0 | 2.001931e+09 | NaN | 0.0 | 60.0 | NaN | 0.0 | NaN | NaN | 243.0 | NaN